Search for: All records

Creators/Authors contains: "Chen, Sitan"


  1. We develop theory to understand an intriguing property of diffusion models for image generation that we term critical windows. Empirically, it has been observed that there are narrow time intervals in sampling during which particular features of the final image emerge, e.g. the image class or background color (Ho et al., 2020b; Meng et al., 2022; Choi et al., 2022; Raya & Ambrogioni, 2023; Georgiev et al., 2023; Sclocchi et al., 2024; Biroli et al., 2024). While this is advantageous for interpretability as it implies one can localize properties of the generation to a small segment of the trajectory, it seems at odds with the continuous nature of the diffusion. We propose a formal framework for studying these windows and show that for data coming from a mixture of strongly log-concave densities, these windows can be provably bounded in terms of certain measures of inter- and intra-group separation. We also instantiate these bounds for concrete examples like well-conditioned Gaussian mixtures. Finally, we use our bounds to give a rigorous interpretation of diffusion models as hierarchical samplers that progressively "decide" output features over a discrete sequence of times. We validate our bounds with synthetic experiments. Additionally, preliminary experiments on Stable Diffusion suggest critical windows may serve as a useful tool for diagnosing fairness and privacy violations in real-world diffusion models. 
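As a toy illustration of the critical-window phenomenon described in item 1 (not the paper's construction or proofs), the sketch below runs the reverse diffusion SDE for a one-dimensional mixture of two Gaussians, where the exact score is available in closed form, and records the first time at which the component ("class") of the eventual sample is effectively decided. The mixture parameters, horizon, and 0.99 posterior threshold are arbitrary demo choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Data distribution: 0.5*N(-a, s^2) + 0.5*N(+a, s^2); the sign of the component a sample
# ends up in plays the role of a discrete feature (e.g. the image class).
a, s = 3.0, 0.5
T, n_steps = 5.0, 2000
dt = T / n_steps

def moments(t):
    """Component mean offset and common variance after running the OU forward process for time t."""
    m = a * np.exp(-t)
    v = s**2 * np.exp(-2 * t) + 1.0 - np.exp(-2 * t)
    return m, v

def posterior_plus(x, t):
    """P(component = +a | x_t = x): how 'decided' the feature is at time t."""
    m, v = moments(t)
    return 0.5 * (1.0 + np.tanh(m * x / v))        # numerically stable sigmoid of the log-odds 2mx/v

def score(x, t):
    """Exact score of the mixture marginal at time t (no neural network needed in 1-D)."""
    m, v = moments(t)
    w = posterior_plus(x, t)
    return (w * (m - x) + (1 - w) * (-m - x)) / v

# Reverse-time SDE (Euler-Maruyama), started from the near-stationary N(0, 1) prior at time T.
x = rng.normal()
decided_at = None
for k in range(n_steps):
    t = T - k * dt
    x = x + (x + 2.0 * score(x, t)) * dt + np.sqrt(2.0 * dt) * rng.normal()
    w = posterior_plus(x, t - dt)
    if decided_at is None and (w > 0.99 or w < 0.01):
        decided_at = t - dt                        # first time the class posterior collapses

print("class decided around t =", None if decided_at is None else round(decided_at, 3),
      "| final sample x_0 =", round(float(x), 3))
```

Running this over many random seeds should show the decision times clustering in a narrow band of t, which is the qualitative behavior the paper's bounds formalize for mixtures of strongly log-concave components.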
  2. The authors provide the first tight sample complexity bounds for shadow tomography and classical shadows in the regime where the target error is below some sufficiently small inverse polynomial in the dimension of the Hilbert space. Specifically, they present a protocol that, given any m ∈ ℕ and ε ≤ O(d^{-1/2}), measures O(log(m)/ε²) copies of an unknown mixed state ρ ∈ ℂ^{d×d} and outputs a classical description of ρ. This description can then be used to estimate any collection of m observables to within additive accuracy ε. Previously, even for the simpler case of shadow tomography where the observables are known in advance, the best known rates either scaled benignly but suboptimally in all of m, d, ε, or scaled optimally in ε and m but included additional polynomial factors in d. Interestingly, the authors also show via dimensionality reduction that one can rescale ε and d to reduce to the regime ε ≤ O(d^{-1/2}). Their algorithm draws on representation-theoretic tools developed in the context of full state tomography.
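The protocol in item 2 targets the high-precision regime and uses representation-theoretic machinery that is out of scope for a short snippet. Purely to make the task concrete (estimate many observables from a classical description built out of few measured copies), here is a minimal sketch of the standard classical-shadows estimator with random single-qubit Pauli measurements, in the style of Huang, Kueng, and Preskill; the two-qubit state, observables, and shot count are arbitrary demo choices and nothing here reflects the paper's algorithm or its rates.

```python
import numpy as np

rng = np.random.default_rng(0)

# Single-qubit Paulis.
I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
PAULI = {"I": I2, "X": X, "Y": Y, "Z": Z}

def eigprojs(P):
    """Projectors onto the +1 / -1 eigenspaces of a single-qubit Pauli."""
    return {+1: (I2 + P) / 2, -1: (I2 - P) / 2}

# "Unknown" 2-qubit state for the demo: a noisy Bell state.
bell = np.zeros(4, dtype=complex)
bell[0] = bell[3] = 1 / np.sqrt(2)
rho = 0.9 * np.outer(bell, bell.conj()) + 0.1 * np.eye(4) / 4

def snapshot(rho):
    """Measure each qubit in a uniformly random Pauli basis; return (bases, outcomes)."""
    bases = rng.choice(["X", "Y", "Z"], size=2)
    projs = [eigprojs(PAULI[b]) for b in bases]
    outcomes = [(+1, +1), (+1, -1), (-1, +1), (-1, -1)]
    probs = np.array([np.real(np.trace(rho @ np.kron(projs[0][s0], projs[1][s1])))
                      for s0, s1 in outcomes])
    probs = np.clip(probs, 0, None)
    return bases, outcomes[rng.choice(4, p=probs / probs.sum())]

def estimate(snapshots, pauli_string):
    """Median-of-means estimate of Tr(P rho) for a 2-qubit Pauli string such as 'ZZ' or 'ZI'."""
    vals = []
    for bases, outcome in snapshots:
        v = 1.0
        for q, p in enumerate(pauli_string):
            if p != "I":                                  # identity factors contribute 1
                v *= 3.0 * outcome[q] if bases[q] == p else 0.0
        vals.append(v)
    return np.median([c.mean() for c in np.array_split(np.array(vals), 10)])

shots = [snapshot(rho) for _ in range(20000)]
for P in ["ZZ", "XX", "ZI"]:
    exact = np.real(np.trace(rho @ np.kron(PAULI[P[0]], PAULI[P[1]])))
    print(P, "shadow estimate:", round(estimate(shots, P), 3), " exact:", round(exact, 3))
```

The median-of-means step is the usual trick for turning the per-snapshot estimator into a high-probability guarantee over many observables.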
  3. We investigate the problem of predicting the output behavior of unknown quantum channels. Given query access to an n-qubit channel E and an observable O, we aim to learn the mapping ρ ↦ Tr(O E[ρ]) to within a small error for most ρ sampled from a distribution D. Previously, Huang, Chen, and Preskill proved a surprising result that even if E is arbitrary, this task can be solved in time roughly n^{O(log(1/ε))}, where ε is the target prediction error. However, their guarantee applied only to input distributions D invariant under all single-qubit Clifford gates, and their algorithm fails for important cases such as general product distributions over product states ρ. In this work, we propose a new approach that achieves accurate prediction over essentially any product distribution D, provided it is not "classical", in which case there is a trivial exponential lower bound. Our method employs a "biased Pauli analysis," analogous to classical biased Fourier analysis. Implementing this approach requires overcoming several challenges unique to the quantum setting, including the lack of a basis with appropriate orthogonality properties. The techniques we develop to address these issues may have broader applications in quantum information.
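For a system small enough to simulate, the map ρ ↦ Tr(O E[ρ]) is linear in ρ and hence in the Pauli expectations of ρ, so a toy version of the prediction task in item 3 can be solved by ordinary least squares over those features. The sketch below does this for a hypothetical two-qubit channel (a CNOT followed by local depolarizing noise) and Haar-random product-state inputs; it is only meant to make the learning problem concrete and is unrelated to the biased Pauli analysis that makes the problem tractable at large n.

```python
import numpy as np

rng = np.random.default_rng(1)

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
PAULI_2Q = [np.kron(a, b) for a in (I2, X, Y, Z) for b in (I2, X, Y, Z)]

def random_product_state():
    """A 2-qubit product state with Haar-random pure factors (a simple product distribution D)."""
    rho = np.array([[1.0]], dtype=complex)
    for _ in range(2):
        v = rng.normal(size=2) + 1j * rng.normal(size=2)
        v /= np.linalg.norm(v)
        rho = np.kron(rho, np.outer(v, v.conj()))
    return rho

# Hypothetical "unknown" channel E: a CNOT followed by depolarizing noise of rate p on each qubit.
CNOT = np.array([[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]], dtype=complex)
def channel(rho, p=0.2):
    rho = CNOT @ rho @ CNOT.conj().T
    for q in range(2):
        kraus = [np.sqrt(1 - 3 * p / 4) * I2] + [np.sqrt(p / 4) * P for P in (X, Y, Z)]
        ops = [np.kron(K, I2) if q == 0 else np.kron(I2, K) for K in kraus]
        rho = sum(K @ rho @ K.conj().T for K in ops)
    return rho

O = np.kron(Z, Z)                                         # observable whose output behavior we predict

def features(rho):
    """Pauli expectations of rho; the map rho -> Tr(O E[rho]) is linear in these."""
    return np.array([np.real(np.trace(rho @ P)) for P in PAULI_2Q])

# "Query" the channel on sampled inputs, then fit the linear map by least squares.
train = [random_product_state() for _ in range(300)]
A = np.array([features(r) for r in train])
y = np.array([np.real(np.trace(O @ channel(r))) for r in train])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)

test = [random_product_state() for _ in range(50)]
pred = np.array([features(r) @ coef for r in test])
truth = np.array([np.real(np.trace(O @ channel(r))) for r in test])
print("mean absolute prediction error on held-out inputs:", np.abs(pred - truth).mean())
```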
  4. Recent works have shown that diffusion models can learn essentially any distribution provided one can perform score estimation. Yet it remains poorly understood under what settings score estimation is possible, let alone when practical gradient-based algorithms for this task can provably succeed. In this work, we give the first provably efficient results along these lines for one of the most fundamental distribution families, Gaussian mixture models. We prove that gradient descent on the denoising diffusion probabilistic model (DDPM) objective can efficiently recover the ground truth parameters of the mixture model in the following two settings: 1) We show gradient descent with random initialization learns mixtures of two spherical Gaussians in d dimensions with 1/poly(d)-separated centers. 2) We show gradient descent with a warm start learns mixtures of K spherical Gaussians with Ω(log(min(K,d)))-separated centers. A key ingredient in our proofs is a new connection between score-based methods and two other approaches to distribution learning: the EM algorithm and spectral methods.
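A minimal numerical sketch of setting 1 from item 4, under simplifying assumptions: a single fixed noise scale rather than the full DDPM schedule, and a score model parameterized in the same functional form as the true score of a symmetric two-component mixture, s_θ(x) = −x + tanh(⟨θ, x⟩)·θ. Dimension, learning rate, batch size, and step count are arbitrary demo choices; at this noise scale the population minimizer is ±α·μ*, so the sign ambiguity is accounted for when reporting error.

```python
import numpy as np

rng = np.random.default_rng(0)

d = 8
mu_star = rng.normal(size=d)
mu_star *= 2.0 / np.linalg.norm(mu_star)     # ground-truth center, norm 2 (demo choice)

alpha, sigma = 0.8, 0.6                      # one fixed noise scale, alpha^2 + sigma^2 = 1

def sample_batch(n):
    """x0 ~ 0.5*N(mu*, I) + 0.5*N(-mu*, I); return the noised x = alpha*x0 + sigma*eps and eps."""
    signs = rng.choice([-1.0, 1.0], size=(n, 1))
    x0 = signs * mu_star + rng.normal(size=(n, d))
    eps = rng.normal(size=(n, d))
    return alpha * x0 + sigma * eps, eps

# Score model with the true score's functional form: s_theta(x) = -x + tanh(<theta, x>) * theta.
# DDPM / denoising objective at this noise scale: E || sigma * s_theta(x) + eps ||^2.
theta = 0.1 * rng.normal(size=d)             # random initialization, as in setting 1
lr = 0.05
for step in range(3000):
    x, eps = sample_batch(256)
    th = np.tanh(x @ theta)                                      # (n,)
    r = sigma * (-x + th[:, None] * theta) + eps                 # residual, (n, d)
    grad = 2 * sigma * (th[:, None] * r + ((1 - th**2) * (r @ theta))[:, None] * x)
    theta -= lr * grad.mean(axis=0)

# The population minimizer at this noise scale is +/- alpha * mu_star (the sign is unidentifiable).
mu_hat = theta / alpha
err = min(np.linalg.norm(mu_hat - mu_star), np.linalg.norm(mu_hat + mu_star))
print("relative parameter error:", err / np.linalg.norm(mu_star))
```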
  5. The recent proliferation of NISQ devices has made it imperative to understand their power. In this work, we define and study the complexity class NISQ, which encapsulates problems that can be efficiently solved by a classical computer with access to noisy quantum circuits. We establish super-polynomial separations in the complexity among classical computation, NISQ, and fault-tolerant quantum computation to solve some problems based on modifications of Simon's problems. We then consider the power of NISQ for three well-studied problems. For unstructured search, we prove that NISQ cannot achieve a Grover-like quadratic speedup over classical computers. For the Bernstein-Vazirani problem, we show that NISQ only needs a number of queries logarithmic in what is required for classical computers. Finally, for a quantum state learning problem, we prove that NISQ is exponentially weaker than classical computers with access to noiseless constant-depth quantum circuits.
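For context on the Bernstein-Vazirani comparison in item 5: the classical baseline needs one oracle query per bit of the hidden string. Below is a minimal sketch of that classical algorithm only; it does not model the noisy quantum circuits or the complexity-class results of the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

n = 16
secret = rng.integers(0, 2, size=n)            # hidden string s

def oracle(x):
    """Bernstein-Vazirani oracle f(x) = <s, x> mod 2."""
    return int(np.dot(secret, x) % 2)

# Classical recovery: query each standard basis vector e_i, since f(e_i) = s_i.
# This uses exactly n queries, the classical benchmark referenced in the abstract.
recovered = np.array([oracle(np.eye(n, dtype=int)[i]) for i in range(n)])
print("recovered == secret:", bool(np.all(recovered == secret)), "| queries used:", n)
```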